If you go to Amazon.com or the Apple iTunes store, your ability to search for new music will largely be limited by the 'query-by-metadata' paradigm: search by song, artist or album name. However, when we talk or write about music, we use a rich vocabulary of semantic concepts to convey our listening experience. If we can model a relationship between these concepts and the audio content, then we can produce a more flexible music search engine based on a 'query-by-semantic-description' paradigm.
In this talk, I will present a computer audition system that can both annotate novel audio tracks with semantically meaningful words and retrieve relevant tracks from a database of unlabeled audio content given a text-based query. I consider the related tasks of content-based audio annotation and retrieval as one supervised multi-class, multi-label problem in which we model the joint probability of acoustic features and words. For each word in a vocabulary, we use an annotated corpus of songs to train a Gaussian mixture model (GMM) over an audio feature space. We estimate the parameters of the model using the weighted mixture hierarchies Expectation-Maximization algorithm. This algorithm is more scalable to large data sets and produces better density estimates than standard parameter estimation techniques. The quality of the music annotations produced by our system is comparable with the performance of humans on the same task. Our 'query-by-semantic-description' system can retrieve appropriate songs for a large number of musically relevant words. I also show that our audition system is general by learning a model that can annotate and retrieve sound effects.
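To make the annotation and retrieval steps concrete, here is a minimal Python sketch of how per-word GMMs could be used once they exist. It is an illustration under stated assumptions, not the system described in the talk: the two-word vocabulary, the 2-D features, the GMM parameters, the track names, and all function names are hypothetical and set by hand. Training (whether by standard EM or the weighted mixture hierarchies variant) is omitted entirely. The sketch also assumes a uniform prior over words, treats feature vectors as independent, and averages log-likelihoods over frames so that track length does not dominate the scores.

```python
import math

# Hypothetical toy models: one diagonal-covariance GMM per vocabulary word,
# stored as a list of (weight, means, variances) components over 2-D features.
# In a real system these parameters would be learned from annotated songs.
WORD_GMMS = {
    "calm": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 0.0], [1.0, 1.0])],
    "loud": [(1.0, [5.0, 5.0], [1.0, 1.0])],
}

# Hypothetical unlabeled tracks: each is a list of 2-D feature vectors.
TRACKS = {
    "track_a": [[0.1, 0.0], [0.9, -0.1]],
    "track_b": [[5.1, 4.9], [4.8, 5.2]],
}


def log_gaussian_diag(x, means, variances):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, means, variances)
    )


def log_gmm(x, components):
    """Log density of a GMM at x, computed with log-sum-exp for stability."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in components]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def word_scores(frames, word_gmms):
    """Per-word scores for one track, normalized to sum to 1.

    Assumes a uniform word prior and averages the frame log-likelihoods,
    so the result is a length-normalized stand-in for P(word | track).
    """
    avg_ll = {
        word: sum(log_gmm(x, gmm) for x in frames) / len(frames)
        for word, gmm in word_gmms.items()
    }
    top = max(avg_ll.values())
    unnorm = {word: math.exp(ll - top) for word, ll in avg_ll.items()}
    total = sum(unnorm.values())
    return {word: u / total for word, u in unnorm.items()}


def annotate(frames, word_gmms, k=1):
    """Annotation: return the k highest-scoring words for a track."""
    scores = word_scores(frames, word_gmms)
    return sorted(scores, key=scores.get, reverse=True)[:k]


def retrieve(query_word, tracks, word_gmms):
    """Retrieval: rank track names by their score for the query word."""
    return sorted(
        tracks,
        key=lambda name: word_scores(tracks[name], word_gmms)[query_word],
        reverse=True,
    )


if __name__ == "__main__":
    print(annotate(TRACKS["track_a"], WORD_GMMS))
    print(retrieve("loud", TRACKS, WORD_GMMS))
```

Both tasks reuse the same per-word scores: annotation sorts the words for a fixed track, while retrieval sorts the tracks for a fixed word.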
Lastly, I will discuss three techniques for collecting the semantic annotations of music that are needed to train such a computer audition system. They include text-mining web documents, conducting surveys, and deploying human computation games.